Unsupervised Feature Selection Using Nonnegative Spectral Analysis

Authors

  • Zechao Li
  • Yi Yang
  • Jing Liu
  • Xiaofang Zhou
  • Hanqing Lu
Abstract

In this paper, a new unsupervised learning algorithm, namely Nonnegative Discriminative Feature Selection (NDFS), is proposed. To exploit discriminative information in unsupervised scenarios, we perform spectral clustering to learn the cluster labels of the input samples, during which feature selection is performed simultaneously. The joint learning of the cluster labels and the feature selection matrix enables NDFS to select the most discriminative features. To learn more accurate cluster labels, a nonnegative constraint is explicitly imposed on the class indicators. To reduce redundant or even noisy features, an ℓ2,1-norm minimization constraint is added to the objective function, which guarantees that the feature selection matrix is sparse in rows. Our algorithm exploits discriminative information and feature correlation simultaneously to select a better feature subset. A simple yet efficient iterative algorithm is designed to optimize the proposed objective function. Experimental results on different real-world datasets demonstrate the encouraging performance of our algorithm over state-of-the-art methods.

Introduction

The dimension of data is often very high in many domains (Jain and Zongker 1997; Guyon and Elisseeff 2003), such as image and video understanding (Wang et al. 2009a; 2009b) and bioinformatics. In practice, not all features are important and discriminative, since most of them are correlated with or redundant to each other, and sometimes noisy (Duda, Hart, and Stork 2001; Liu, Wu, and Zhang 2011). Such features may have adverse effects on some learning tasks, such as over-fitting, low efficiency, and poor performance (Liu, Wu, and Zhang 2011). Consequently, it is necessary to reduce dimensionality, which can be achieved by feature selection or by transformation to a low-dimensional space.
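As a concrete illustration of the row-sparsity effect mentioned in the abstract: the ℓ2,1-norm of a matrix is the sum of the ℓ2-norms of its rows, so minimizing it drives entire rows (i.e., entire features) to zero. The following NumPy sketch is illustrative and not taken from the paper:

```python
import numpy as np

def l21_norm(W):
    """l2,1-norm of W: the sum of the Euclidean norms of its rows."""
    return np.sum(np.sqrt(np.sum(W * W, axis=1)))

# Two matrices with the same Frobenius norm (sqrt(3)):
W_dense = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 0.0]])           # every row (feature) is active
W_sparse = np.array([[np.sqrt(3.0), 0.0],
                     [0.0, 0.0],
                     [0.0, 0.0]])          # only row 0 is active

print(l21_norm(W_dense))   # 3.0
print(l21_norm(W_sparse))  # ~1.732
```

Although both matrices have identical Frobenius norm, the row-sparse one has a much smaller ℓ2,1-norm, which is why this regularizer favors solutions where uninformative features receive all-zero rows.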
In this paper, we focus on feature selection, which chooses discriminative features by eliminating those with little or no predictive information according to certain criteria. Many feature selection algorithms have been proposed; they can be classified into three main families: filter, wrapper, and embedded methods. Filter methods (Duda, Hart, and Stork 2001; He, Cai, and Niyogi 2005; Zhao and Liu 2007; Masaeli, Fung, and Dy 2010; Liu, Wu, and Zhang 2011; Yang et al. 2011a) use statistical properties of the features to filter out poorly informative ones. They are usually applied before classification algorithms and select a subset of features based only on the intrinsic properties of the data. In wrapper approaches (Guyon and Elisseeff 2003; Rakotomamonjy 2003), feature selection is "wrapped" in a learning algorithm, and the classification performance of the features is taken as the evaluation criterion. Embedded methods (Vapnik 1998; Zhu et al. 2003) perform feature selection in the process of model construction. In contrast with filter methods, wrapper and embedded methods are tightly coupled with built-in classifiers, which makes them less general and computationally expensive. In this paper, we focus on filter feature selection. Because of the importance of discriminative information in data analysis, it is beneficial to exploit such information, which is usually encoded in labels, for feature selection. However, selecting discriminative features in unsupervised scenarios is a significant but hard task due to the lack of labels. In light of this, we propose a novel unsupervised feature selection algorithm, namely Nonnegative Discriminative Feature Selection (NDFS), in this paper.
We perform spectral clustering and feature selection simultaneously to select discriminative features for unsupervised learning. The cluster label indicators obtained by spectral clustering guide the feature selection procedure. Unlike most previous spectral clustering algorithms (Shi and Malik 2000; Yu and Shi 2003), we explicitly impose a nonnegative constraint in the objective function, which is natural and reasonable as discussed later in this paper. With the nonnegative and orthogonality constraints, the learned cluster indicators are much closer to the ideal results and can be readily used to obtain cluster labels. Our method exploits discriminative information and feature correlation in a joint framework. For the sake of feature selection, the feature selection matrix is constrained to be sparse in rows, which is formulated as an ℓ2,1-norm minimization term. To solve the proposed problem, a simple yet effective iterative algorithm is proposed. Extensive experiments conducted on different datasets show that the proposed approach outperforms the state of the art in different applications.

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence
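The joint framework described above can be sketched in simplified form. This is an illustrative reimplementation under assumptions, not the authors' algorithm: the orthogonality constraint on the indicator matrix and the paper's exact multiplicative updates are omitted, and `ndfs_sketch` with all its parameter names is ours. It alternates a closed-form reweighted least-squares step for the feature selection matrix W (handling the ℓ2,1 term) with a projected gradient step keeping the cluster indicators F nonnegative:

```python
import numpy as np

def ndfs_sketch(X, L, k, alpha=1.0, beta=1.0, n_iter=100, lr=0.01, eps=1e-8):
    """Simplified alternating optimization in the spirit of NDFS:
      min_{F >= 0, W}  Tr(F^T L F) + beta * (||X W - F||_F^2 + alpha * ||W||_{2,1})
    Illustrative sketch only; the orthogonality constraint and the
    authors' exact update rules are omitted."""
    n, d = X.shape
    rng = np.random.default_rng(0)
    F = np.abs(rng.standard_normal((n, k)))   # nonnegative cluster indicators
    D = np.eye(d)                             # reweighting matrix for the l2,1 term
    for _ in range(n_iter):
        # W-step: closed form of the reweighted least-squares subproblem,
        # W = (X^T X + alpha * D)^{-1} X^T F, with D_ii = 1 / (2 ||w_i||_2)
        W = np.linalg.solve(X.T @ X + alpha * D, X.T @ F)
        D = np.diag(1.0 / (2.0 * (np.linalg.norm(W, axis=1) + eps)))
        # F-step: projected gradient step that keeps F nonnegative
        grad = L @ F + beta * (F - X @ W)
        F = np.maximum(F - lr * grad, 0.0)
    return W, F

# Toy data: feature 0 separates two clusters, features 1-3 are noise
rng = np.random.default_rng(1)
X = 0.01 * rng.standard_normal((10, 4))
X[5:, 0] += 5.0
S = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))  # RBF affinity
L = np.diag(S.sum(axis=1)) - S                               # graph Laplacian
W, F = ndfs_sketch(X, L, k=2)
ranking = np.argsort(-np.linalg.norm(W, axis=1))  # features ranked by row norm
```

On this toy problem the reweighting drives the rows of W for the noise features toward zero, so the informative feature ends up with the largest row norm, which is how row norms of W serve as feature-selection scores.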

Similar articles

Robust Unsupervised Feature Selection

A new unsupervised feature selection method, namely Robust Unsupervised Feature Selection (RUFS), is proposed. Unlike traditional unsupervised feature selection methods, pseudo cluster labels are learned via local-learning-regularized robust nonnegative matrix factorization. During the label learning process, feature selection is performed simultaneously by robust joint ℓ2,1-norm minimization. ...

Spectral clustering and discriminant analysis for unsupervised feature selection

In this paper, we propose a novel method for unsupervised feature selection, which utilizes spectral clustering and discriminant analysis to learn the cluster labels of data. During the learning of cluster labels, feature selection is performed simultaneously. By imposing row sparsity on the transformation matrix, the proposed method optimizes for selecting the most discriminative features whic...

Dimensionality Reduction of Hyperspectral Data to Increase Class Separability and Preserve Data Structure

Hyperspectral imaging, which gathers hundreds of spectral bands from the surface of the Earth, allows us to separate materials with similar spectra. Hyperspectral images can be used in many applications, such as land chemical and physical parameter estimation, classification, target detection, unmixing, and so on. Among these applications, classification is of particular interest. A hyperspectral im...

Unsupervised Spectral-Spatial Feature Selection-Based Camouflaged Object Detection Using VNIR Hyperspectral Camera

The detection of camouflaged objects is important for industrial inspection, medical diagnosis, and military applications. Conventional supervised learning methods for hyperspectral images can be a feasible solution. Such approaches, however, require a priori information about the camouflaged object and the background. This letter proposes a fully autonomous feature selection and camouflaged object dete...

Efficient Spectral Feature Selection with Minimum Redundancy

Spectral feature selection identifies relevant features by measuring their capability of preserving sample similarity. It provides a powerful framework for both supervised and unsupervised feature selection, and has been proven to be effective in many real-world applications. One common drawback associated with most existing spectral feature selection algorithms is that they evaluate features i...
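The classic Laplacian Score (He, Cai, and Niyogi 2005, cited above) is one well-known instance of this similarity-preserving criterion: each feature is scored by how smoothly it varies over a sample-similarity graph, with lower scores being better. The sketch below is ours, not code from the cited work:

```python
import numpy as np

def laplacian_score(X, S, eps=1e-12):
    """Laplacian-Score-style feature scoring: a lower score means the
    feature better preserves the sample similarity encoded in S."""
    d = S.sum(axis=1)                  # vertex degrees
    L = np.diag(d) - S                 # graph Laplacian
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        f = X[:, j]
        f_t = f - (f @ d) / d.sum()    # remove the degree-weighted mean
        scores[j] = (f_t @ L @ f_t) / ((f_t ** 2 * d).sum() + eps)
    return scores

# Feature 0 matches the two-block similarity structure; feature 1 is noise
S = np.zeros((6, 6))
S[:3, :3] = 1.0
S[3:, 3:] = 1.0
X = np.column_stack([
    np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0]),  # cluster-consistent feature
    np.array([0.9, 0.1, 0.5, 0.2, 0.8, 0.4]),  # noise feature
])
scores = laplacian_score(X, S)  # scores[0] is ~0, well below scores[1]
```

Because the first feature is constant within each similarity block, its graph-smoothness term vanishes and it receives the best (lowest) score, which is exactly the "preserving sample similarity" behavior the blurb describes.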

Publication date: 2012